(57) A massively parallel processor is constructed
from a large number of individual processor units
connected, first, to form a plurality of processor
sections, each containing one or more of the processor
units interconnected for data communication by a
redundant bus structure. In turn, the processor
sections are interconnected in a toroidal configuration
to form an array of rows and columns in which each
processor section is coupled to four immediate neighbor
processor sections by dual communication paths, thereby
providing at least two separate paths for communicating
data from any one processor unit to any other processor
unit. Each processor unit includes a separate
input/output bus structure which can be used to
interconnect processor section arrays in a third
dimension for expansion.
BACKGROUND OF THE INVENTION

The present invention is directed to data processing
systems, and more particularly to a parallel processing
environment in which a large number of processing units
in a network are interconnected in a parallel topology
to form a massively parallel processing system.

Parallel processing has found a variety of commercial
applications in today's industry such as, for example,
on-line transaction processing, in which numerous
individual transactions or small tasks are distributed
among multiple processors to be handled quickly. Other
parallel processing applications include maintaining
and accessing large data bases for record-keeping and
decision-making operations, or use as media servers
that provide an accessible store of information to many
users. Parallel processing's particular advantage
resides in the ability to handle large amounts of
diverse data such as, for example, in decision-making
operations which may require searches of diverse
information that can be scattered among a number of
storage devices. Or, a parallel processor media server
application could be in an interactive service
environment such as "movies-on-demand" that will call
upon the parallel processor to provide a vast number of
customers with access to a large reservoir of motion
pictures kept on retrievable memory (e.g., disk storage
devices). This latter application may well require the
parallel processor to simultaneously service multiple
requests by locating, selecting, and retrieving the
requested motion pictures, and then forwarding the
selections to the requesting customers.

A limiting factor on parallel processing applica- 
tions is the requirement of high availability of the sys- 
tem (e.g., 24 hours a day, 365 days a year) which op- 
erates to limit the size of the system (e.g., the number 
of processor units that make up the system). This lim- 
itation results from the fact that as the number of sys- 
tem components increases, so do the chances of a 
component failure. 

Perhaps a more significant limit on parallel processor
system size is the number of communication paths
available for accessing and moving the large amounts of
data often encountered in parallel processing
environments. And, the problem of limited throughput
can be exacerbated as the number of individual
processor units of the system increases, so that
massive parallel processing systems are ineffective in
uses requiring searching, movement and/or communication
of large amounts of data. Where a small number of
communication paths can act to limit the amount of
processing speed and power that can be afforded by
parallel processing techniques, increasing the number
of communication paths tends to increase the risk that
component failure will bring down significant portions
of the parallel processor, if not the entire parallel
processor.

Accordingly, there is needed a technique, and an
architecture, for interconnecting large pluralities of
processor units to form a massively parallel processing
system that provides each processor unit with high
availability and a useable bandwidth for accessing and
providing data maintained by the system.

The present invention is designed to incorporate
presently available, off-the-shelf elements to
interconnect a plurality of individual processor units
to form a massively parallel processor that can be much
less expensive than, for example, conventional,
so-called "top-of-the-line" supercomputers, yet provide
as much or more computing power with much higher
performance. Further, the interconnections, forming
data communication paths between groups of processor
units, are redundant so that no single component
failure will operate to terminate use of any portion of
the parallel processor. In fact, as will be seen, the
interconnecting network topology between these groups
of processor units provides a multitude of data
communication paths between any one of the groups of
processor units and any other group, so that loss of an
entire redundant communication path will not
significantly affect performance or operation of the
parallel processor.

Broadly, the invention is directed to interconnecting a
large multiple of self-contained processor units (e.g.,
each with its own memory system, input/output (I/O),
peripheral devices, etc.) in a manner that provides
each processor unit with at least two data paths to any
other processor unit. In one construction of the
invention, small numbers (e.g., one or more, up to
four) of the processor units are interconnected to one
another by a redundant bus structure, forming
"processor sections." The processor sections, in turn,
are interconnected by dual ring data communication
paths, forming a row-column array of processor sections
in which each processor section of the array is
provided with two direct communication paths to each of
its four immediate neighbors (including those processor
sections which are located at the edges or peripheries
of the array, which are coupled to processor sections
at the opposite peripheries). Each processor section is
thereby provided with at least four data communication
paths to any other processor section of the array.
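
By way of illustration only, the wraparound adjacency described above
can be sketched in Python as follows. The function name
`torus_neighbors` and the zero-based (row, column) indexing are
assumptions made for this sketch; they do not appear in the disclosure.

```python
def torus_neighbors(row, col, rows, cols):
    """Return the four immediate neighbors of section (row, col).

    Sections on an edge of the rows x cols array wrap around to the
    section on the opposite edge, as in a ring per row and per column.
    """
    return {
        "up": ((row - 1) % rows, col),
        "down": ((row + 1) % rows, col),
        "left": (row, (col - 1) % cols),
        "right": (row, (col + 1) % cols),
    }


if __name__ == "__main__":
    # Corner section of a 3 x 3 array: "up" and "left" wrap around.
    print(torus_neighbors(0, 0, 3, 3))
```

In this sketch each neighbor relation stands for a pair of redundant
paths, so a section has two direct paths to each neighbor returned.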

The result is a toroidal interconnection of multiple
processor units forming a massively parallel processor
with fault tolerant, high bandwidth data communication
paths from any one processor unit to any other
processor unit of the parallel processor.

In a further construction of the invention a first
array of processor sections, interconnected as
described above, can be coupled to a similarly
constructed second array of processor sections, using
the I/O paths available to one or more of the processor
units of each processor section. Thereby, a
three-dimensional array of multiple processor units is
used to obtain the massively parallel processor.

In the disclosed embodiment of the invention, the
processor units of each processor section are
interconnected by a dual interprocessor bus structure
for communicating data therebetween generally according
to the teachings of U.S. Patent No. 4,228,496, although
those skilled in this art will see that other
interprocessor connections may be used for processor
units within the processor sections. Connectivity
between processor sections in one direction of the
array (e.g., the columns) is preferably accomplished
using the apparatus and methods taught by U.S. Patent
Nos. 4,667,287 and 4,663,706, whose teachings are
incorporated herein by reference. Connectivity between
processor sections in the other direction of the array
(e.g., the rows) is preferably through use of the
apparatus and method taught by U.S. patent application
Ser. No. 07/599,325, filed October 17, 1990, and
assigned to the assignee of the present application,
which is incorporated herein by reference.

Interprocessor data communications within any processor
section use the interprocessor bus structure. Data
communications between any processor unit of one
processor section and a processor unit of another
processor section will be first by the interprocessor
bus to the appropriate connection apparatus, then via
interconnecting links to the processor section
containing the destination processor unit, and on the
interprocessor bus structure to the destination
processor unit. When data is communicated between
processor units of different processor sections, the
interconnection apparatus will select the shorter of
the two possible paths provided by the interconnecting
ring for transmission to the destination processor
section and processor unit. When, however, data is to
be communicated between processor sections contained in
different rows of the array, the data is communicated
by first routing the message "vertically" (i.e., from
row to row), until the row containing the destination
processor unit is encountered. Then, the message is
routed "horizontally" (i.e., within that row) to the
destination processor section (and the destination
processor unit it contains). Again, the shortest
possible paths, both for the row-to-row route and
within the destination row, are selected.
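
The routing order just described (row to row first, then within the
destination row, taking the shorter way around each ring) may be
sketched as follows. This is a minimal illustration under stated
assumptions, not the disclosed apparatus: the names `ring_step` and
`route`, the zero-based (row, column) coordinates, and the rule that a
tie between the two directions is resolved toward increasing index are
all hypothetical.

```python
def ring_step(src, dst, size):
    """Direction (+1, -1, or 0) of the shorter way from src to dst
    around a ring of `size` positions. Ties go to +1 (an assumption)."""
    if src == dst:
        return 0
    forward = (dst - src) % size
    return 1 if forward <= size - forward else -1


def route(src, dst, rows, cols):
    """List the sections visited from src to dst, both (row, col)."""
    (r, c), (dest_r, dest_c) = src, dst
    path = [(r, c)]
    # First "vertically": along the column ring until the destination row.
    while r != dest_r:
        r = (r + ring_step(r, dest_r, rows)) % rows
        path.append((r, c))
    # Then "horizontally": along the row ring to the destination section.
    while c != dest_c:
        c = (c + ring_step(c, dest_c, cols)) % cols
        path.append((r, c))
    return path


if __name__ == "__main__":
    # Top-left section to the middle section of the bottom row (3 x 3).
    print(route((0, 0), (2, 1), 3, 3))
```

In a 3 x 3 array the example goes from the first row to the third row
in a single hop by way of the wraparound link, rather than passing
through the second row.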

A number of advantages are realized by the pres- 
ent invention. First and foremost is the fact that the in- 
terconnection of the multiple processor units is fault 
tolerant; a fault in a data communication path need 
not bring down the system, or a significant part of the 
system. Should a data communication path fail, other 
paths are available. Thus, the present invention pro- 
vides a highly available parallel processor. 

Another advantage of the present invention is that
throughput is significantly increased. Since the number
of interconnections between the groups of individual
processor units forming the parallel processor system
is increased, data traffic is also increased, in turn
increasing data input/output through the system.

These and other advantages will become apparent to
those skilled in this art upon a reading of the
following detailed description of the invention, which
should be taken in conjunction with the accompanying
drawings.

Fig. 1 is a simplified representation of an array of
processor sections, each containing one or more
processor units, interconnected according to the
present invention, forming a massively parallel
processor system;
Fig. 2 is a diagrammatic illustration of one of the
processor sections, showing one method of
interconnecting the processor units that make up the
processor section, and illustrating the apparatus used
to interconnect the processor section to the other
processor sections of Fig. 1;
Fig. 3 is a simplified diagram of two four-processor
section arrays interconnected to form a
three-dimensional parallel processor array; and
Fig. 4 is an alternate embodiment of the invention,
illustrating, in simplified form, an array of
paired-processor processing units interconnected by
multi-ported input/output routers to form a massively
parallel processor according to the present invention.

Turning now to the figures, and for the moment
principally Fig. 1, illustrated in simplified form is a
parallel processor system, designated generally with
the reference numeral 10. As shown, the parallel
processor system 10 comprises a plurality of processor
sections 12, each of which contains one or more
processor units. In the context of the present
invention, it is preferred that the number of processor
units contained in each processor section 12 be limited
to four, a principal reason being to keep from
overloading the bus structure used to communicate data
between the processor units in a processor section.

Continuing with Fig. 1, the processor sections 12 are
interconnected by horizontal communication paths 14 in
ring-like configurations, forming a plurality of
processor section rows 15. In similar fashion,
communication paths 16 interconnect the processor
sections 12 in a vertical direction (as viewed in
Fig. 1) in ring-like manner to form columns 17 of
processor sections 12. As shown in Fig. 1, each of the
communication paths 14, 16 is redundant, thereby
providing a pair of communication paths in any
direction for each of the processor sections 12. Thus,
for example,
the communication path 14_1 provides the processor
section 12_1 with four separate routes for
communicating data to any of the other processor
sections 12_2 and 12_3 in row 15_1: two that directly
connect the processor section 12_1 to its immediate
neighbor processor section 12_2, and two that directly
connect the processor section 12_1 to its other
immediate neighbor (within row 15_1) processor section
12_3. Should the processor section 12_1 need to send a
data communication to the processor section 12_3, it
can do so by any one of the four routes: two directly,
or two via the processor section 12_2. In practice,
such a communication will be conducted using the
shortest possible path, here the direct connection, if
available.

EP 0 669 584 A2

In similar fashion, processor sections 12 are
interconnected in the direction of columns 17 by the
communication paths 16, which are also redundantly
constructed. Thus, as in the direction of the rows 15,
each processor section 12 has effectively four
communication paths to any processor section 12 in that
same column 17.

The ring interconnections provided by the communication
paths 14 and 16 provide each processor section 12 with
a number of message routes to any other processor
section 12. For example, the processor section 12_1 may
send message traffic to the processor section 12_8
using any combination of the vertical communication
paths 16_1, 16_2 and horizontal communication paths
14_1, 14_2, 14_3. The preferred routing, for
simplification, is to first route message data from the
processor section 12_1 along a column communication
path (i.e., communication path 16_1) until the row 15
containing the destination processor section 12 of the
data is reached; then the data is routed along the
communication path 14_3 to the destination processor
section 12 that is to receive the data, processor
section 12_8.

Further, the shortest possible routes from the
processor section 12_1 to the processor section 12_8
would be selected. Thus, rather than communicating the
message data from the processor section 12_1 to the
processor section 12_4, and from there to the processor
section 12_7 of the row 15_3, the data would be routed
along a path that communicates it directly from the
processor section 12_1 to the processor section 12_7.
When received at the destination row 15_3 by the
processor section 12_7, the message data will again be
routed along the shortest horizontal path to the
destination processor section 12_8 (i.e., directly from
the processor section 12_7 to the processor section
12_8, rather than via the processor section 12_9 and
then to the processor section 12_8). Thus, not only
does the invention provide additional paths for data
communication between the multiple processor units of
the parallel processor system 10, but communications
are conducted in a way that assures that the shortest
route is taken from the sending processor unit
(processor section) to the destination processor unit
(processor section).

Turning now to Fig. 2, a processor section 12_n is
shown containing four substantially identically
constructed processor units 20 interconnected by an
interprocessor bus structure 24 for communicating data
therebetween as taught by the aforementioned U.S. Pat.
No. 4,228,496. The processor section may contain fewer
processor units 20 (e.g., 1-3), and processor sections
12 of a row 15 or column 17 may have different numbers
of processor units.

Continuing with Fig. 2, each of the processor units 20
has an input/output (I/O) system that includes an I/O
bus 22 connecting the processor unit 20 to various
peripherals and/or peripheral controllers such as the
disk controller 26 that provides the processor unit 20
access to the disk storage units 27. As taught by the
'496 patent, the disk controller 26 has a second port
to provide access to, and control over, the storage
units 27 to another of the processor units 20. The
processor units 20 are interconnected by the bus
structure 24. Although the bus structure 24 is shown as
being separate from the bus structure 22, it will be
evident to those skilled in the art that, using today's
high speed technology (e.g., microprocessors, or
microprocessor controlled channels, etc.),
interprocessor communications could be conducted via
interconnecting the I/O busses of the processor units
20 using appropriate communication controllers, as is
done in a second embodiment of the invention
illustrated in Fig. 4, and discussed below.

However, there are advantages to using a separate bus
structure 24 for interprocessor communication, one of
which is the availability of connection apparatus
designed to operate with such bus structure 24 for
interconnecting processor sections in the manner shown
in Fig. 1. The bus structure 24 typically, as taught by
the '496 patent, is implemented with redundant busses
24a, 24b, to provide a fault tolerant capability as
well as increased message traffic bandwidth.

Each of the pair of row and the pair of column
communication paths that couple the processor section
12_n to its four immediate neighbors will connect to a
corresponding one of the interprocessor busses 24a,
24b. Thus, the interprocessor bus 24a that connects the
processor units 20 to one another is coupled to one of
the pair of row communication paths 14_na by a row
interface unit 30a. A column interface unit 32a
connects the interprocessor bus 24a to one of the pair
of vertical or column communication paths 16_na. In
similar fashion, the redundant interprocessor bus 24b
connects to the other of the row communication paths
14_nb by a row interface unit 30b, while the other of
the column communication paths 16_nb is connected to
the interprocessor bus 24b by a column interface unit
32b.

The structure and operation of the row interface units
30a, 30b are taught by U.S. Patent Application Ser. No.
07/599,325, filed October 17, 1990. U.S. Pat. Nos.
4,667,287 and 4,663,706 teach the structure and
operation of the vertical interface units 32a, 32b.
Both row and column interface units 30, 32 are
structured to use serial fiber optic links to form the
dual communication paths 14, 16. It will be evident
that bit parallel communication paths could also be
implemented.

Each processor unit 20 is responsible for handling
communications from its processor section to another
processor section containing the destination processor
unit 20 in the vertical direction (i.e., along the
column communication paths 16). Each processor unit 20
maintains a table identifying which processor units 20
form a part of its own sub-system 15, and in which
sub-systems they are located. However, only one
processor unit 20 of each processor section 12 is
responsible for obtaining and maintaining (updating)
the information used by the other processor units 20 of
the processor section 12 to develop the tables.

The information is gathered as follows. Each
responsible processor unit 20 of each processor section
12 will develop an inquiry message that is sent to its
immediate neighbor processor sections 12, requesting
them to identify their immediate neighbors, the
direction of such neighbors (i.e., which is on the
"right," and which is on the "left"), and information
as to the make-up of the rows 15, such as what
processor units are in such row 15. When the
interrogating processor unit receives back responses to
its inquiry messages, it will then formulate similar
inquiries that are sent beyond the immediate neighbors
to the processor sections on the other side of the
immediate neighbors. Again, the responses received back
are used to send additional messages, beyond the
neighbors so far identified by the received responses,
to still further neighbors, and this process continues
until the responses begin to identify to the
interrogating processor unit 20 processor sections (and
row make-ups) already known to it. At that point the
interrogating process stops, and the information
gathered thereby is distributed to the other processor
units 20 of the processor section 12 containing the
interrogating processor unit 20. Periodically this
interrogating process is re-initiated to ensure that
the view of the system obtained since the last
interrogation process has not changed, or if it has,
what those changes are. Any changes are used to update
the tables maintained by each processor unit 20.
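
The outward-expanding interrogation described above resembles a
breadth-first exploration of a ring, and can be sketched as follows.
This is only an illustrative model: the inquiry and response messages
are replaced by a hypothetical callback `neighbors_of`, the helper
`ring_neighbors` stands in for a column of processor sections, and the
hop count recorded per section is an assumption about what the
resulting table might hold.

```python
def ring_neighbors(size):
    """Build a stand-in for inquiry/response on a ring of `size` sections:
    asking section s yields its two immediate neighbors."""
    return lambda s: ((s - 1) % size, (s + 1) % size)


def discover(start, neighbors_of):
    """Interrogate outward from `start` until responses name only
    sections already known. Returns {section: hops from start}."""
    known = {start: 0}
    frontier = [start]
    while frontier:
        next_frontier = []
        for section in frontier:
            for neighbor in neighbors_of(section):
                if neighbor not in known:
                    known[neighbor] = known[section] + 1
                    next_frontier.append(neighbor)
        # An empty next_frontier means every response was already known.
        frontier = next_frontier
    return known


if __name__ == "__main__":
    print(discover(0, ring_neighbors(5)))
```

Re-running `discover` periodically and comparing the result with the
previous table corresponds to the re-initiated interrogation that
detects changes in the system.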

A similar procedure is used for the row communication
paths 14 within each system to determine where, and in
which (shortest) direction, each processor unit 20 is
located relative to any particular processor section
12. As described in U.S. patent application Ser. No.
07/599,325, a maintenance diagnostic system (MDS; not
shown herein) forms a part of each of the rows 15, and
is connected to each processor unit 20 and row
interface unit 30 of that row. Among the various tasks
of the MDS is the responsibility to interrogate each
processor unit 20 of a row 15 with which the MDS is
associated to determine what processor units are
contained in the row and associated with what processor
sections. This information is written to the row
interface units 30 of the row, and used, when messages
are to be sent from one processor unit 20 to another in
a different processor section 12 of a row 15, to select
a direction (route) along the row communication path 14
that is the most direct to the processor section 12
containing the destination processor unit 20.

Once the parallel processor system 10 has been
brought up, and the various interrogations completed
to determine where various processor units are, the
system 10 operates generally as follows. Assume that
Fig. 2 is an illustration of the processor section 12$ 
(Fig. 1), and that one of the processor units 20
desires to send information to, for example, a
destination processor unit 20 of the processor section
12_2. The sending processor unit 20 will create a
message according to a predetermined format that
identifies the destination processor unit 20 of
processor section 12_2 by the row in which it is
contained, and the processor unit's identification
within that row. The sending processor unit 20 then
transmits the message onto one of the two
interprocessor busses 24a, 24b. Since, according to
protocol, the message will be transmitted first
vertically, or along a column communication path 16,
the column interface unit 32 will recognize the address
of the message as being destined outside the particular
row and will capture the message for transmission on
the corresponding column communication path 16_n. In
doing so, the column interface unit 32 will select the
shortest path to the row 15 containing the destination
processor unit 20, based upon the address contained in
the message, and transmit the message on the selected
path.

The transmitted message will be received by the column
interface unit 32 associated with the processor section
12_3, recognized as being for a destination processor
unit that is in the same row 15 as that of the
processor section 12_3, and coupled to the
interprocessor bus 24 of that processor section 12.
Since the message identification (address) will not
identify any of the processor units 20 of the processor
section 12_3, but does identify a processor unit in the
same sub-system 15 as that of processor section 12_3,
the row interface unit 30 of that processor section
will pick up the message from the interprocessor bus
24. The row interface unit 30 will determine from the
address contained in the message which direction to
transmit the message on the row communication path 14_1
for the shortest path to the destination processor unit
20, and send it along that selected path to the
processor section 12_2. There, the associated row
interface unit 30 will communicate the message to the
interprocessor bus 24, where it is then received by the
destination processor unit 20 coupled thereto.
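
The capture decisions made at each processor section in the walk-through
above can be summarized in a short sketch. It assumes, for illustration
only, that a message address is a (row, unit identifier) pair and that
each section knows its own row and the identifiers of its local
processor units; the function name `who_captures` and the string labels
it returns are hypothetical.

```python
def who_captures(dest_row, dest_unit, my_row, local_units):
    """Decide which element of a processor section takes a message
    that appears on its interprocessor bus."""
    if dest_row != my_row:
        # Destined outside this row: the column interface unit takes it
        # and forwards it along the column communication path.
        return "column interface unit"
    if dest_unit not in local_units:
        # Right row, but another section: the row interface unit takes
        # it and forwards it along the row communication path.
        return "row interface unit"
    # Addressed to a processor unit on this interprocessor bus.
    return "destination processor unit"


if __name__ == "__main__":
    local = {"A", "B", "C", "D"}
    for address in [(1, "X"), (3, "X"), (3, "B")]:
        print(address, "->", who_captures(*address, my_row=3, local_units=local))
```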

The parallel processor system 10, described above, uses
a two-dimensional toroidal network configuration to
interconnect the processor sections 12 of Fig. 1 in
rows and columns, forming the row and column
communication paths 14 and 16, respectively. However,
the network can be extended to three dimensions, as
illustrated in Fig. 3.

As Fig. 3 illustrates, in more simplified form (for
reasons of clarity), a parallel processor 60 includes a
plurality of processor sections 62 organized by
communication paths as described above in two separate
planes A, B. The processor sections 62 of each plane
are interconnected by row and column ring communication
paths 64, 66, respectively, to form a toroidal network
configuration within each plane. More specifically, the
plane A comprises processor sections 62_1 ... 62_4
interconnected by row communication paths 64_1, 64_2
and column communication paths 66_1, 66_2. In similar
fashion, the processor section plane B comprises the
processor sections 62_5 ... 62_8 interconnected by the
row and column communication paths 64_3, 64_4 and 66_3,
66_4, respectively.

In addition to the interconnecting row and column
communication paths 64, 66, each processor section 62
of each plane A, B is also coupled by communication
paths 68 to corresponding processor sections 62 of the
other plane, forming in effect a three-dimensional
parallel processor 60. The path connections between
each processor section 62 and the corresponding row and
column communication paths 64, 66 are the same as
described above (i.e., they use the row and column
interface units 30, 32, Fig. 2). The path connections
for the paths 68 are established using the I/O system
of the processor units 20 and a communications
controller. Thus, for example, assume that the
processor section 12_n of Fig. 2 is the processor
section 62_1 of Fig. 3. For the configuration shown in
Fig. 3, a processor unit 20 is provided with a
communications controller 70 (illustrated in phantom in
Fig. 2) that connects to a processor unit 20 (not
shown) in the processor section 62_5 to provide the
bridge between the two processor sections 62_1, 62_5,
and associated planes A, B.

Message traffic within the individual planes of a
three-dimensional toroidal configuration would be the
same as described above: message traffic would first be
sent along a vertical communication path 66 until it
reached the horizontal row containing the destination
processor unit. Then, the message would be sent
horizontally until it reached the communications path
68 containing the processor section 62 having the
destination processor unit. Finally, the message would
be transmitted via the associated communications
controllers from one processor section 62 to the other,
where it would be put on the associated interprocessor
bus (unless the message was for the particular
processor unit responsible for maintaining the bridge
between the planes A, B). (Although the parallel
processor 60 shown in Fig. 3 contains only four
processor sections 62 in each plane A, B, this is done
for ease of description; the planes themselves could
easily be expanded, as could the third dimension, as
will be evident to those skilled in this art.) Thus, if
the destination processor unit is in another plane from
that of the sender processor unit, message traffic
would still travel first along a column communication
path, then along a row communication path, and then
across from one of the planes A, B to the other.
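
The order of travel in the three-dimensional case (column path, then
row path, then across planes) can be illustrated with the following
sketch, which only plans the hops along each dimension. The coordinate
convention (plane, row, column), the function names, and the treatment
of the plane dimension as a ring are assumptions made for illustration;
the disclosure shows only two planes joined by the paths 68.

```python
def shortest_delta(src, dst, size):
    """Signed hop count for the shorter way round a ring of `size`
    positions (positive = increasing index; ties go positive)."""
    forward = (dst - src) % size
    backward = forward - size
    return forward if forward <= -backward else backward


def plan_hops(src, dst, shape):
    """src, dst and shape are (plane, row, column) tuples.

    Returns (label, signed hops) pairs in the order the message travels:
    row to row on a column path, then within the row, then across planes.
    """
    order = ((1, "column path"), (2, "row path"), (0, "plane link"))
    plan = []
    for axis, label in order:
        delta = shortest_delta(src[axis], dst[axis], shape[axis])
        if delta != 0:
            plan.append((label, delta))
    return plan


if __name__ == "__main__":
    # Two planes of 3 x 3 sections (larger than the 2 x 2 planes of Fig. 3).
    print(plan_hops((0, 0, 0), (1, 2, 1), (2, 3, 3)))
```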

The foregoing discussion of the invention has
illustrated its use in connection with a particular
architecture: processor sections in which the
individual processor units are connected by an
interprocessor bus. The invention is susceptible of use
in other architectures, however, albeit preferably
fault tolerant architectures. Thus, for example, Fig. 4
illustrates the invention in connection with a
different architecture that is discussed in more detail
in U.S. patent application Ser. No. 07/992,944, filed
December 17, 1992, and assigned to the assignee of this
application, the disclosure of which is also
incorporated herein by reference.

As disclosed in the aforementioned application, and as
illustrated generally in Fig. 4, a computer
architecture comprises pairs 102 of central processing
units (CPUs) 104, that may operate in lock step
fashion, or individually, to perform data processing
activities. Each CPU 104 has an input/output system
that is accessed by the CPU through multi-ported
routers 108. In addition, each CPU 104 (e.g., CPU 104a)
is connected to the router (e.g., 108b) of its sibling
CPU (e.g., 104b) and, therefore, has access to the I/O
system of that sibling. Conversely, the sibling CPU
(104b) is connected through the router 108a so that it
has access to the I/O system of its sibling, CPU 104a.

Fig. 4 shows a parallel processor 100 comprising four
sets of CPU pairs 102, each CPU pair comprising the two
CPUs 104a, 104b. Each CPU 104 has an individual
input/output system that is accessed through an
associated router 108 by one of two bus connections
106. The CPU 104a has a bus pair 106a connecting it to
routers 108a, 108b. The router 108a provides the CPU
104a with access to its own I/O system (not shown).
Similarly, the router 108b provides the CPU 104b with
access to its I/O system. In addition, the routers
108a, 108b respectively provide CPUs 104b, 104a with
access to the I/O system of the other. The other CPU
pairs 102_2, 102_3, 102_4 are, as Fig. 4 shows,
similarly constructed.

The routers 108 are multi-ported devices, providing at
each port bi-directional communication interfaces.
Thus, each of the buses connecting the CPU 104a to the
routers 108a, 108b is a bi-directional bus
configuration, permitting two-way communication of
data.

The routers 108 are designed to have six bi-directional
ports. When used as the access point to the I/O system
of a particular CPU, two of the ports are used to
connect to the CPU pair 102; the other four are free
for other interconnections. Thus, one of the
bi-directional ports of the routers 108 may be
interconnected by bus paths 114 forming multi-processor
rows A'. Another port of one router associated with
each CPU pair 102 connects by bus paths 116 to form the
columns B' as illustrated in Fig. 4. Thereby, the
toroidal network configuration is attained in the same
fashion as the processor sections 12 were
interconnected by the communication paths 14, 16 of
Fig. 1.

A similarly constructed and interconnected array of CPU
pairs 102 could be connected, using other ports of the
routers 108, in order to form a three-dimensional
toroidal network array connection such as that
illustrated in Fig. 3. Further, the architecture of
Fig. 4 lends itself to being expanded much more easily
using routers 108, as more particularly discussed in
the aforementioned application (Ser. No. 07/992,944).

Having now described the present invention in the
context of two equivalent parallel processor
architectures, the advantages of the dual-toroidal network
interconnection of the processor sections should now
be evident. Of particular importance is the fact that
failure of any single data communication path, or of any
component in a communication path, between any
pair of the processor units will not inhibit or destroy
communication between that pair of processor units.
Further, communication between processor sections
is capable of withstanding loss of both direct
communication paths connecting neighboring processor
systems.
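
The fault-tolerance property stated above can be checked on a small model. The Python sketch below is an illustration only and rests on assumptions not found in the disclosure: a hypothetical 3 x 4 array, processor sections treated as single nodes (the internal redundant buses are not modeled), and the helper names `dual_torus_links`, `is_connected`, and `without_pair`. It builds a torus with two parallel paths per neighbor pair, then tests that the array stays connected when any one path fails and when both direct paths between any one neighboring pair fail.

```python
from collections import deque
from itertools import product


def dual_torus_links(rows, cols):
    """List every link of a rows x cols torus, two parallel paths per neighbor pair."""
    links = []
    for r, c in product(range(rows), range(cols)):
        for nbr in ((r, (c + 1) % cols), ((r + 1) % rows, c)):
            for path_id in (0, 1):  # the dual communication paths
                links.append(((r, c), nbr, path_id))
    return links


def is_connected(nodes, links):
    """Breadth-first search: True if every node is reachable from the first."""
    adjacency = {n: set() for n in nodes}
    for a, b, _ in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        for nxt in adjacency[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(nodes)


def without_pair(links, a, b):
    """Drop both direct paths between neighboring sections a and b."""
    return [link for link in links if {link[0], link[1]} != {a, b}]


rows, cols = 3, 4  # hypothetical array size
nodes = list(product(range(rows), range(cols)))
links = dual_torus_links(rows, cols)

# Case 1: any single path fails.
single_ok = all(
    is_connected(nodes, links[:i] + links[i + 1:]) for i in range(len(links))
)

# Case 2: both direct paths between one neighboring pair fail.
pair_ok = all(
    is_connected(nodes, without_pair(links, a, b))
    for a, b, path_id in links
    if path_id == 0
)
```

In the second case the two sections remain reachable from each other only by the longer route around the ring or through adjacent rows and columns, which is the alternative routing the toroidal arrangement provides.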

While a full and complete disclosure of the inven- 
tion has been provided herein above, it will be obvious 
to those skilled in the art that various modifications 
and changes may be made. 



Claims 

1. A multiple processor system, comprising:

a plurality of processor sections, each of
the plurality of processor sections having one or
more processor units and means interconnecting
the one or more processor units for
communicating data therebetween;

means for interconnecting the plurality of
processor sections in an array of a first number
of rows of processor sections and a second number
of columns of processor sections in a manner
that establishes for each row and each column of
processor sections a circular communication
path for communicating data from any one of the
plurality of processor sections to any other of the
plurality of processor sections in the array;

whereby the interconnecting means provides
at least two communication paths between
any one of the plurality of processor sections and
four immediate neighbor ones of the plurality of
processor sections.

2. The multiple processor system of claim 1, wherein
the means interconnecting the processor units
of each of the plurality of processor sections
includes redundant bus means coupled to
each of said processor elements.

3. The multiple processor system of claim 2, wherein
the interconnecting means is coupled
to the bus means of each corresponding one
of the plurality of processor sections.



4. A multiple processor system, comprising:

a first plurality of processor sections, each
of the processor sections comprising:

one or more processor elements, and
means interconnecting the processor elements
for communicating data therebetween;

means for interconnecting said sections to
form at least first and second rows of processor
sections; and

means for interconnecting corresponding
ones of the sections of each of the first and second
rows to form a dual ring-like communication
path that provides a dual communication path for
each section of each of the first and second rows
to each of the other of the first and second rows.

5. A method of forming a massively parallel processing
system, comprising the steps of:

providing a plurality of processor sections,
each of the processor sections including at least
a pair of processor elements interconnected for
communicating data therebetween;

interconnecting the processor sections in
a manner that forms a first number of processor
section groups, each of the number of processor
section groups including means forming a ring
data communication path for communicating the
processor sections of such processor section
group to one another by two data communication
paths; and

interconnecting a second number of corresponding
ones of processor sections, each of the
second number of corresponding ones of processor
sections being contained in a corresponding
one of the first number of processor section
groups, in a manner that forms two communication
paths for communicating data between the
processor sections of each of the second number.


6. A parallel processor system, comprising:

a number of parallel processor sub-systems,
each of the number of parallel processor
sub-systems including a plurality of processor
units grouped in processor sections, each of the
processor sections having at least one of the
plurality of processor units;

first means forming a ring data communication
path between the processor units of each
of the number of parallel processor sub-systems
to communicate data therebetween; and

second means interconnecting corresponding
ones of the number of processor sections
of the parallel processor sub-systems in a
ring configuration to communicate data
therebetween.

7. The parallel processor system of claim 6, wherein
the first and second means include fiber-optic
data paths for communicating data.

8. A parallel processing system, comprising:

a plurality of processor sections, each of
the processor sections including at least a pair of
processor units coupled to one another by
processor bus means for interprocessor
communication;

means for interconnecting the plurality of
processor sections in an array arranging the
plurality of processor sections in rows and columns,
the interconnecting means including a pair of bus
means coupling each one of the plurality of
processor sections to each of four other of the plurality
of processor sections for data communication so
that the processor sections of each row and each
column are coupled together by the interconnecting
means in dual data communicating ring
configurations.

9. The parallel processing system of claim 8, wherein
the bus means for coupling the processor units
of each processor section includes at least first
and second bus means.

10. The parallel processing system of claim 9, including
circuit means for coupling the first and second
bus means to a corresponding one of the pair of
bus means.



FIG. 2.

FIG. 3.